Abstract

Methods for the statistical characterization of the large-scale structure
in the Universe will be the main topic of the present text. The focus is
on geometrical methods, mainly Minkowski functionals and the J-function.
Their relations to standard methods used in cosmology and spatial statistics
and their application to cosmological datasets will be discussed. A short
introduction to the standard picture of cosmology is given.

1 Introduction

A fundamental problem in cosmology is to understand the formation of the large-scale
structure in the Universe. Normally theoretical models of large-scale structure,
whether involving analytical predictions or numerical simulations, are based
on some form of random or stochastic initial conditions. This means that a statistical
interpretation of clustering data is required, and that statistical tools must be
deployed in order to discriminate between different cosmological models. Moreover,
the identification and characterization of specific geometric features in the galaxy
distribution like walls, filaments, and clusters will deepen our understanding of
structure formation, assist in the construction of approximations and also help to
constrain cosmological models.

During the past two decades enormous progress has been made in the mapping 
of the distribution of galaxies in the Universe. Using the measured redshifts of 
galaxies as distance indicators, and knowing their angular positions on the sky, we 
can obtain a three-dimensional view of the distribution of luminous matter in the 
Universe. Presently available redshift surveys already permit the detailed study of 
the statistical properties of the spatial distribution of galaxies. Surveys of galaxy 
redshifts that cover reasonable solid angles and are significantly deeper than those 
presently available present important challenges, and not just for the observers. A 
precise definition of the statistical methods is needed to extract the most from the
costly data, and this is an important goal for theorists.

A complete review of the variety of statistical methods used in cosmology is not 
attempted. The focus of this overview will be on methods of point process statistics 
using geometrical ideas like Minkowski functionals and the J-function; moment 
based methods will also be mentioned. For reviews with a different emphasis see
e.g. Peebles (1980), Bertschinger (1992), Peacock (1992), Borgani (1995), Efstathiou
(1996), and Martínez (1996).



This text is organized as follows:
In Sect. 2 we will give a short introduction to the common theoretical "prejudice"
in cosmology and describe some observational issues. We briefly comment on two-point
correlations (Sect. 3.1) and moment based methods (Sect. 3.2), and focus on
Minkowski functionals (Sect. 3.3) and the J-function, as well as its extensions the
$J_n$-functions (Sect. 3.4). In Sect. 4 we summarize and provide an outlook.

Figure 1: Projection of the temperature fluctuations in the microwave background
radiation as observed by the COBE satellite (from Schmalzing and Górski 1998).
The relative fluctuations are of the order of $10^{-5}$.



2 Cosmological models and observations 

Most cosmological models studied today are based on the assumption of homogeneity
and isotropy (see however Buchert and Ehlers 1997 and Buchert 1999). Observationally
one can find evidence that supports these assumptions on very large
scales, the strongest being the almost perfect isotropy of the cosmic microwave
background radiation (after assigning the whole dipole to our proper motion relative to
this background). The relative temperature fluctuations over the sky are of the
order of $10^{-5}$, as shown in Fig. 1. This tells us that the Universe was nearly isotropic
and, with some additional assumptions, homogeneous at the time of decoupling,
approximately 13 Gy (giga years) ago.

For such a highly symmetric situation the universal expansion may be described
by a position vector $\mathbf{x}_H(t)$ at time $t$ that can be calculated from the initial
position $\mathbf{x}_i$ at time $t_i$,

$$\mathbf{x}_H(t) = a(t)\,\mathbf{x}_i, \qquad (1)$$

using the scale factor $a(t)$ with $a(t_i) = 1$. The dynamical evolution of $a(t)$ is
determined by the Friedmann equations (see e.g. Padmanabhan 1993). As a direct
consequence the velocities may be approximated by the Hubble law,

$$\mathbf{v}_H(t) = H(t)\,\mathbf{x}_H(t), \qquad (2)$$

relating the distance vector $\mathbf{x}_H(t)$ with the velocity $\mathbf{v}_H(t)$ by the Hubble parameter
$H(t) = \dot{a}(t)/a(t)$. Indeed such a mainly linear relationship is observed for galaxies
(see Fig. 2). The visible deviations may be assigned to peculiar motions, as caused
by mass density perturbations.

However, on small and on intermediate scales up to several hundreds of Mpc,
there are significant deviations from homogeneity and isotropy as visible in the
spatial distribution of galaxies. (Megaparsec (Mpc) is the common unit of length
in cosmological applications, with 1 pc = 3.26 light years.) Large holes, filamentary
as well as wall-like structures are observed (Fig. 3, see also Sect. 3.3.4).

Figure 2: Hubble law for galaxy clusters and groups taken from (Sandage, 199-l).
The x-axis is proportional to a distance indicator obtained from a certain
luminosity of the clusters and groups, whereas the y-axis is proportional to the redshift.



One of the goals in cosmology is to understand how these large scale structures
form, given a nearly homogeneous and isotropic matter distribution at some early
time. In the Newtonian approximation the process of structure formation is modeled
using a self-gravitating pressure-less fluid, with the mass density $\varrho(\mathbf{x}, t)$ and the
velocity field $\mathbf{v}(\mathbf{x}, t)$:

$$
\begin{aligned}
\partial_t \varrho + \nabla\cdot(\varrho\,\mathbf{v}) &= 0, \\
\partial_t \mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{v} &= \mathbf{g}, \\
\nabla\times\mathbf{g} &= 0, \\
\nabla\cdot\mathbf{g} &= -4\pi G \varrho.
\end{aligned}
\qquad (3)
$$

The first equation is the continuity equation, stating mass conservation; the second
comes from momentum conservation, with the gravitational acceleration $\mathbf{g}(\mathbf{x}, t)$
self-consistently determined from the mass density. With small fluctuations in $\varrho$ and $\mathbf{v}$
given at some early time, this system of partial differential equations constitutes a
highly non-linear initial value problem. Up to now no general solution is known.
Approximate solutions may be constructed using a perturbative expansion around
the homogeneous background solutions either for the fields $\varrho$ and $\mathbf{v}$ directly or
for the characteristics. The first one is called Eulerian perturbation theory (see
e.g. Peebles 1980), whereas the second is named Lagrangian perturbation theory
(see e.g. Buchert 1996). Numerical integration with N-body simulations is also
used.

The initial conditions are often chosen as a realization of a Gaussian random
field for the density contrast $(\varrho - \varrho_H)/\varrho_H$. In principle a Gaussian random field
model for the density contrast allows for unphysical negative mass densities; however,
we find that the initial fluctuations in the mass density are smaller than the mean
value of the field by a factor of $10^5$, and therefore negative densities
are practically excluded. Using the methods mentioned above we can follow the
nonlinear time evolution of the density field, leading to a highly non-Gaussian field.
In this evolved mass density field galaxies are identified, sometimes also utilizing the
velocity field. Moreover, our understanding of the physical processes determining
the galaxy formation is still limited.

Figure 3: The upper two panels show the position of the galaxies in two neighboring
slices with an angular extent of $135 \times 5\ \mathrm{deg}^2$ and a maximum distance of $120\,h^{-1}$ Mpc
from our galaxy, which is located at the tip of the cone. The galaxies are shown
projected along the angular coordinate spanning only 5 deg. In the lower plot both
slices are shown projected on top of each other (data from Huchra et al. 1990 and
Huchra et al. 1995).

Figure 4: The left figure illustrates the Poisson model, whereas the right figure
shows the peak selection for the same density field.

Two popular stochastic models used to describe the distribution of galaxies are
the Poisson model and the peak selection. In the Poisson model we assume that
the mean number of galaxies inside a region $C$ is directly proportional to the total
mass inside this region (see e.g. Peebles 1980; often also called Poisson sampling).
Hence the intensity measure $\Lambda(C)$, the mean number of galaxies inside $C$, is

$$\Lambda(C) \propto \int_C \mathrm{d}\mathbf{x}\; \varrho(\mathbf{x}). \qquad (4)$$

If the mass density $\varrho$ is modeled as a random field the Poisson model results in a
double-stochastic point process, i.e. a Cox process (Stoyan et al., 1995).

Within the peak selection model, galaxies appear only at the peaks of the density
field above some given threshold (Bardeen et al., 1986). This model is an example
of an "interrupted point process" (Stoyan et al., 1995). In Figure 4 we illustrate
both models in the one-dimensional case. There are also dynamically and
microphysically motivated models for the identification of galaxies in simulations, which we do
not cover here (Kates et al. 1991, Weiß and Buchert 199^, Kauffmann et al. 1997).
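
As an illustration of the two models in one dimension, the following minimal sketch evaluates the intensity measure of Eq. (4) on a sampled density field and selects peaks above a threshold. The function names, the regular grid, the proportionality constant `amplitude` and the example values are assumptions made for this illustration only; they are not taken from the text.

```python
def intensity_measure(density, dx, start, stop, amplitude=1.0):
    """Poisson model, Eq. (4): Lambda(C) is proportional to the mass inside C.

    `density` samples the mass density on a regular 1D grid with spacing `dx`;
    the region C is the index range [start, stop). `amplitude` is an assumed
    constant of proportionality.
    """
    return amplitude * sum(density[start:stop]) * dx


def peak_positions(density, threshold):
    """Peak selection: grid indices of local maxima lying above `threshold`."""
    peaks = []
    for i in range(1, len(density) - 1):
        is_local_max = density[i - 1] < density[i] >= density[i + 1]
        if is_local_max and density[i] > threshold:
            peaks.append(i)
    return peaks


# Hypothetical density field, for illustration only.
density = [1.0, 3.0, 2.0, 5.0, 4.0, 1.5, 2.5, 1.0]
mean_count = intensity_measure(density, dx=0.5, start=0, stop=4)
galaxy_sites = peak_positions(density, threshold=2.8)
print(mean_count, galaxy_sites)
```

In the Poisson model the number of galaxies in C would then be drawn from a Poisson distribution with mean `mean_count`, whereas peak selection places galaxies deterministically at `galaxy_sites` once the density field is given.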

As we have seen several "parameters" enter these partly deterministic, partly 
stochastic models for the galaxy distribution. Before describing the statistical meth- 
ods used to constrain these parameters, typical observational problems entering the 
construction of galaxy catalogues will be mentioned. 

The starting point is the two-dimensional distribution of galaxies on the celestial 
sphere. Their angular positions are known to a high precision compared to their 
radial distance r. In most galaxy catalogues the radial distance is estimated utilizing 
the redshift: 



$$z = \frac{\lambda_{\rm obs} - \lambda_{\rm lab}}{\lambda_{\rm lab}}, \qquad (5)$$

with the observed wavelength of a spectral line $\lambda_{\rm obs}$ and with the wavelength of
the same spectral line measured in a laboratory $\lambda_{\rm lab}$. Out to several hundreds of
Mpc the relation between the radial distance $r$ and the redshift $z$ is to a good
approximation

$$c z \approx |\mathbf{v}_H| + u = H_0\, r + u, \qquad (6)$$

with the velocity of light $c$, and the Hubble parameter $H_0$ at present time (see
Eq. (2)). $u$ is the radial component of the peculiar velocity, i.e. the local deviation
from the global expansion due to inhomogeneities. Galaxy catalogues sampled
homogeneously and with $r$ determined independently from the redshift are still
rare. Therefore the distance is simply estimated by

$$r = \frac{c z}{H_0}, \qquad (7)$$

neglecting the peculiar velocities $u$. This is often called "working in redshift space".
There is still some controversy about the actual value of the Hubble parameter,
which is parameterized by the number $h$: $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. Likely
values are in the range $h = 0.5$ to $0.8$.

Figure 5: In the left figure the spatial distribution of the galaxies taken from the
IRAS 1.2Jy galaxy catalogue (Fisher et al., 1995), projected along one axis. The
horizontal cones indicate the region where the observation was obscured due to
the absorption in our own galaxy. In the right plot the absolute luminosity of a
galaxy against its radial distance is shown; each point represents one galaxy. The
volume limited subsample with limiting distance of $100\,h^{-1}$ Mpc includes only the
galaxies in the marked upper left corner of the figure.
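
The redshift-space distance estimate can be written down in a few lines. This is a minimal sketch: the function names are chosen for this illustration, and the distance is returned in Mpc for a given $h$, with the peculiar velocity neglected as in the text.

```python
C_KM_S = 299792.458  # velocity of light in km/s


def redshift_from_wavelengths(lambda_obs, lambda_lab):
    """Redshift z of a spectral line, Eq. (5)."""
    return (lambda_obs - lambda_lab) / lambda_lab


def redshift_space_distance(z, h=1.0):
    """Distance estimate r = c z / H_0 in Mpc, with H_0 = 100 h km/s/Mpc.

    The peculiar velocity u is neglected ("working in redshift space").
    """
    hubble_0 = 100.0 * h  # km/s/Mpc
    return C_KM_S * z / hubble_0


z = redshift_from_wavelengths(lambda_obs=663.0, lambda_lab=656.3)
print(z, redshift_space_distance(z, h=0.65))
```

Since the result scales as $1/h$, distances are usually quoted in units of $h^{-1}$ Mpc, which corresponds to calling `redshift_space_distance(z, h=1.0)`.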

Furthermore we have to face another problem. The majority of galaxy catalogues
are flux (i.e. magnitude) limited. This means that the catalogue is complete for
galaxies with a flux higher than some minimum flux $f_{\rm min}$. As a first approximation
the absolute luminosity $L$ of a galaxy with observed flux $f_{\rm obs}$ at distance $r$ may be
calculated by $L = 4\pi r^2 f_{\rm obs}$. Hence at larger distances we observe only the brightest
galaxies, as can be seen in Figure 5, resulting in a systematically inhomogeneously
sampled point set in three dimensions. To construct a homogeneously sampled
point set from such a galaxy catalogue we may restrict ourselves to galaxies closer
than $r_{\rm lim}$ with an absolute luminosity higher than $L_{\rm lim} = 4\pi r_{\rm lim}^2 f_{\rm min}$. This procedure is
called "volume limitation". Such a set of galaxies for $r_{\rm lim} = 100\,h^{-1}$ Mpc is marked
in Figure 5 and the spatial distribution is shown in Figure 6. Especially in the
direction of the disc of our galaxy, in the galactic plane, we suffer from extinction
mainly due to dust. To take care of this we use a cut of 5 to 30 degrees (depending
on the catalogue under consideration) around the galactic plane, resulting in a
deformed sampling window, as can be seen in Figure 5.
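
The volume limitation described above amounts to a simple selection rule. The following sketch assumes that each galaxy is given as a pair of radial distance and observed flux in consistent (arbitrary) units; the function names and the example catalogue are hypothetical.

```python
import math


def absolute_luminosity(flux, r):
    """First approximation L = 4 pi r^2 f for a galaxy at distance r."""
    return 4.0 * math.pi * r ** 2 * flux


def volume_limited_sample(galaxies, r_lim, f_min):
    """Keep galaxies closer than r_lim and more luminous than L_lim.

    `galaxies` is a list of (r, flux) pairs; L_lim = 4 pi r_lim^2 f_min is the
    luminosity a galaxy needs to stay above the flux limit out to r_lim.
    """
    l_lim = absolute_luminosity(f_min, r_lim)
    return [
        (r, flux)
        for r, flux in galaxies
        if r < r_lim and absolute_luminosity(flux, r) > l_lim
    ]


# Hypothetical catalogue: (distance, observed flux).
catalogue = [(50.0, 8.0), (50.0, 2.0), (90.0, 1.5), (120.0, 5.0)]
print(volume_limited_sample(catalogue, r_lim=100.0, f_min=1.0))
```

The second galaxy is bright enough to be in the flux-limited catalogue but would drop out if moved to `r_lim`, so it is removed; the last one lies beyond the limiting distance.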

The following discussion will refer to a set of points $X = \{\mathbf{x}_i\}_{i=1}^{N}$. The objects
located at these points are either galaxies, or galaxy clusters, or super-clusters
(clusters of galaxy clusters). Galaxies are well defined objects in space, with an
extent of typically $0.03\,h^{-1}$ Mpc. Similarly, galaxy clusters are well defined objects,
clearly visible in the two-dimensional distribution of galaxies, with a typical extent
of 1 to 3 $h^{-1}$ Mpc. Whether the combination of galaxy clusters to super-clusters is a
reasonable concept is still a matter of debate (Kerscher, 1998b).

Figure 6: The spatial distribution of IRAS galaxies in a volume limited sample with
a depth of $100\,h^{-1}$ Mpc, projected along one coordinate axis. This volume limited
sample is formed by the galaxies shown in the upper left corner of the plot with
luminosity against radial distance (Figure 5).



3 Statistics of large scale structure 

New observations of our Universe will give us an increasingly precise mapping of the
galaxy distribution around us (Gunn 1995, Maddox 1998). But we will have only
one realization. This makes a statistical analysis problematic; in particular, model
assumptions like stationarity (homogeneity) and isotropy may be tested locally only.
For an interesting discussion of such problems see Matheron (1989). Still, global
methods like the Minkowski functionals give us information on the shape and
topology of this point set.

A pragmatic interpretation is that with a statistical analysis of a galaxy cata- 
logue, one wants to constrain parameters of the cosmological models. These models 
incorporate some randomness, quantifying our ignorance of the initial conditions, or 
our limited understanding of the exact physical processes leading to the formation 
of galaxies. 



3.1 Two-point statistics

Second-order statistics, also called two-point statistics, are still among the major
tools to characterize the spatial distribution of galaxies. With the mean number
density, or intensity, denoted by $\rho$, the product density

$$\rho_2(\mathbf{x}_1, \mathbf{x}_2)\,\mathrm{d}V(\mathbf{x}_1)\,\mathrm{d}V(\mathbf{x}_2) = \rho^2 g(r)\,\mathrm{d}V(\mathbf{x}_1)\,\mathrm{d}V(\mathbf{x}_2) \qquad (8)$$

describes the probability to find a point in the volume element $\mathrm{d}V(\mathbf{x}_1)$ and another
point in $\mathrm{d}V(\mathbf{x}_2)$, at the distance $r = |\mathbf{x}_1 - \mathbf{x}_2|$; $|\cdot|$ is the Euclidean norm (we assume
stationarity and isotropy). The product density $\rho_2(\mathbf{x}_1, \mathbf{x}_2)$ is the Lebesgue density
of the second factorial moment measure (e.g. Stoyan et al. 1995). Often the (full)
two-point correlation function, also called pair correlation function, $g(r)$ and the
normed cumulant $\xi_2(r) = g(r) - 1$ are considered. Throughout the cosmological
literature $\xi_2(r)$ is also called (two-point) correlation function (Peebles, 1980). For
a Poisson process one has $g(r) = 1$. Closely related is the correlation integral $C(r)$
(e.g. Grassberger and Procaccia 1984), the average number of points inside a ball
of radius $r$ centred on a point of the distribution,

$$C(r) = \int_0^r \mathrm{d}s\; \rho\, 4\pi s^2 g(s), \qquad (9)$$

which is related by $K(r) = C(r)/\rho$ to Ripley's K function; see Stoyan's paper in
this volume. Another common way to characterize the second-order properties is
the excess fluctuation of the number density inside of $C$ with respect to a Poisson
process:

$$\sigma^2(C) = \frac{1}{|C|^2} \int_C \mathrm{d}\mathbf{x} \int_C \mathrm{d}\mathbf{y}\; \xi_2(|\mathbf{x} - \mathbf{y}|). \qquad (10)$$

Often the power spectrum $P(k)$ is used to quantify the second order statistical
properties of the point distribution (Peebles, 1980). $P(k)$ may be defined as the
Fourier transform of $\xi_2(r) = g(r) - 1$:

$$P(k) = \frac{1}{(2\pi)^3} \int \mathrm{d}\mathbf{x}\; e^{-i\mathbf{k}\cdot\mathbf{x}}\, \xi_2(|\mathbf{x}|), \qquad (11)$$

with $k = |\mathbf{k}|$.

Observed two-point correlations. The first analysis of a galaxy catalogue using
the two-point correlation function was presented by Totsuji and Kihara (1969).
Following the work of Peebles (1973), the two-point correlation function has
become the standard tool, applied to nearly every cosmological dataset. The need
for boundary corrected estimators was recognized early. Several estimators have
been introduced, with differing claims on their applicability (Landy and Szalay 1993,
Hamilton 1993, Stoyan and Stoyan 2000, Kerscher 1999, Pons-Bordería et al.
1999). A clarification for cosmological applications is attempted in Kerscher et al.
(1999b).

Fig. 7 shows the (full) correlation function $g(r)$ and the normed cumulant $\xi_2(r)$
determined from a volume limited sample of the Southern Sky Redshift Survey 2
(SSRS2; da Costa et al. 1998) with 1179 galaxies. The strong clustering of galaxies,
due to their gravitational interaction, is shown by large values of $g(r)$ and $\xi_2(r)$ for
small $r$.

Of special physical interest is whether the two-point correlations are
scale-invariant. A scale-invariant $g(r) \propto r^{D-3}$ is an indication for a fractal distribution
of the galaxies (Mandelbrot 1982, Sylos Labini et al. 1998). A scale-invariant
$\xi_2(r) \propto r^{-\gamma}$ is expected in critical phenomena (see Goldenfeld 1992, Gaite et al.
1999).



Now let us look at the log-log plot in Figure 7. Willmer et al. (1998) give a
scale-invariant fit of $\xi_2(r) \propto r^{-\gamma}$ with a scaling exponent $\gamma = 1.81$ in the range of
3 to 12 $h^{-1}$ Mpc for the volume limited sample with $100\,h^{-1}$ Mpc depth. However, on smaller
scales the slope of $\xi_2$ is flattening, suggesting that a scale-invariant function $\propto r^{-\gamma}$
gives only a poor description of the observed $\xi_2(r)$ in this SSRS2 sample. If we look
at the correlation function $g(r)$ in Figure 7, the observed data may be approximated
by $g(r) \propto r^{D-3}$ with $D = 2$ over the larger range from 0.5 to 20 $h^{-1}$ Mpc. However,
the scale-invariance of $g(r)$ is observed over less than two decades only, and therefore
an estimate of a fractal dimension $D$ from the scaling exponent of $g(r)$ may be
misleading (Stoyan 1998, McCauley 1997, McCauley 1998, Kerscher 1999). On
large scales the observed $g(r)$ also deviates from a purely scale-invariant model, and
shows a tendency towards unity. This however depends on the estimator chosen.
In this specific sample, a scale-invariant $g(r)$ seems to be suitable, but this is not
so clear from other data sets. Also the result on small scales might be unreliable
due to the small number of pairs with a short separation. For a comprehensive
analysis of the SSRS2 catalogue focusing on two-point properties and scaling see
Cappi et al. (1998).

Figure 7: Estimated two-point correlation function $g(r)$ (solid) and the normed
cumulant $\xi_2(r) = g(r) - 1$ (dashed) in a double logarithmic plot for the volume
limited sample from the SSRS2 with $100\,h^{-1}$ Mpc depth. The results of the minus
(reduced-sample) estimator and the Fiksel (1988) estimator are shown, illustrating
that only on large scales differences occur. The straight lines correspond to power
laws for $g(r)$ (solid) and $\xi_2(r)$ (dashed).
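
A scaling exponent is typically read off as the slope of a straight-line fit in the log-log plot. The sketch below does this by ordinary least squares and converts the slope of $g(r) \propto r^{D-3}$ into $D$. The function names and the synthetic input are assumptions for illustration; as stressed above, such a slope measured over less than two decades should not be over-interpreted as a fractal dimension.

```python
import math


def fit_power_law(r_values, g_values):
    """Least-squares fit of log g = slope * log r + log amplitude."""
    xs = [math.log(r) for r in r_values]
    ys = [math.log(g) for g in g_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    amplitude = math.exp(mean_y - slope * mean_x)
    return slope, amplitude


def dimension_from_slope(slope):
    """For g(r) proportional to r^(D-3), the log-log slope is D - 3."""
    return 3.0 + slope


# Synthetic, exactly scale-invariant input: g(r) = 3 / r, i.e. D = 2.
radii = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
g_of_r = [3.0 / r for r in radii]
slope, amplitude = fit_power_law(radii, g_of_r)
print(slope, amplitude, dimension_from_slope(slope))
```

Applied to $\xi_2(r)$ instead of $g(r)$, the same fit returns $-\gamma$ as the slope.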

Hence, currently we cannot exclude a scale-invariant $g(r)$, a scale-invariant
$\xi_2(r)$, or no scale-invariance at all, with the limited observational range provided
by the available three-dimensional catalogues. Hopefully this controversial issue
will be clarified in the near future by the advent of deeper galaxy catalogues
(Gunn 1995, Maddox 1998).



3.2 Higher moments 

The two-point correlation function plays an important role in cosmology, since the
inflationary paradigm suggests that the initial deviations from the homogeneous
density field may be modeled as a Gaussian random field, stochastically completely
specified by its mean density and its two-point correlation function (see e.g. Börner
1993). The analogous construction for point distributions is the Gauss-Poisson
process (Milne and Westcott, 1972), with subtle but important differences from
the Gaussian random field model. However, the nonlinear evolution of the mass
density given by (3) generates high order correlations, not explainable within a
Gaussian model. Hence, assuming an initial Gaussian density field, these higher
order correlations give us information on the process of structure formation.

To investigate these nonlinear structures several methods are used. In
Sections 3.3 and 3.4 we will focus on morphological tools like the Minkowski functionals
and on the J-function. A geometrical method we do not cover in this text is
percolation analysis as introduced to cosmology by Shandarin (1983) (see also Sahni
et al. 1997). Yet another one is the analysis based on the minimal spanning tree
(Barrow et al. 1985, Doroshkevich et al. 1999). A description of the direct, moment
based methods employed in cosmology is given now (Peebles, 1980):

As a generalization of the product density (8) one considers n-th order product
densities

$$\rho_n(\mathbf{x}_1, \ldots, \mathbf{x}_n)\,\mathrm{d}V_1 \ldots \mathrm{d}V_n, \qquad (12)$$

giving the probability of finding $n$ points in the volume elements $\mathrm{d}V_1$ to $\mathrm{d}V_n$,
respectively. Again $\rho_n$ is the Lebesgue density of the n-th factorial moment measure
(Stoyan et al., 1995). In physical applications the (normalized) cumulants are often
considered. As an example we look at the three-point correlations:

$$\rho_3(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) = \rho^3 \left( 1 + \xi_2(|\mathbf{x}_1 - \mathbf{x}_2|) + \xi_2(|\mathbf{x}_2 - \mathbf{x}_3|) + \xi_2(|\mathbf{x}_1 - \mathbf{x}_3|) + \xi_3(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) \right). \qquad (13)$$

The three-point correlation function, i.e. the cumulant $\xi_3$, describes the correlation
of three points in addition to their correlations determined from the pairs. For
a Poisson process all $\xi_n$ with $n \geq 2$ equal zero. A general definition of the n-point
correlation functions is possible using generating functions (e.g. Daley and
Vere-Jones 1988, Borgani 1995). Although the interpretation is straightforward,
the application is problematic, because a large number of triples etc. are needed to
get a stable estimate. Therefore, one looks for $\xi_n$, $n = 3, 4, \ldots$ mainly in angular,
two-dimensional, surveys (e.g. Szapudi and Gaztañaga 1998); for a recent
three-dimensional analysis see Jing and Börner (1998).

More stable estimates of n-point properties, but with reduced informational
content, may be obtained using counts-in-cells (Peebles, 1980). For a test volume
$C$, typically chosen as a sphere, we are interested in the probability $P_N(C)$ of finding
exactly $N$ points in $C$. These $P_N(C)$ determine the one-dimensional (marginal)
distributions considered in spatial statistics (Stoyan et al., 1995). For a Poisson
process we have

$$P_N(C) = \frac{(\rho |C|)^N}{N!} \exp(-\rho |C|), \qquad (14)$$

with the volume $|C|$ of the set $C$. Of special interest is the "void probability"
$P_0(C)$, which serves as a generating functional for all the $P_N(C)$, and relates the
$P_N(C)$ with the n-point correlation functions discussed above (see Stratonovich
1963, White 1979, Daley and Vere-Jones 1988, and Balian and Schaeffer 1989). For
a sphere $B_r$ we have $P_0(B_r) = 1 - F(r) = 1 - H_s(r)$, with the spherical contact
distribution $F(r)$, also denoted by $H_s(r)$ (see Sect. 3.4).
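
For the Poisson reference case these quantities are easy to evaluate. The following sketch implements Eq. (14) and the resulting spherical contact distribution $F(r) = 1 - P_0(B_r)$; the function names and the example numbers are assumptions made for illustration.

```python
import math


def poisson_count_probability(n_points, rho, volume):
    """Eq. (14): probability of finding exactly n_points in a cell of given volume."""
    mean = rho * volume
    return mean ** n_points / math.factorial(n_points) * math.exp(-mean)


def sphere_volume(r):
    """Volume |B_r| of a ball of radius r in three dimensions."""
    return 4.0 * math.pi * r ** 3 / 3.0


def poisson_contact_distribution(rho, r):
    """Spherical contact distribution F(r) = 1 - P_0(B_r) of a Poisson process."""
    return 1.0 - poisson_count_probability(0, rho, sphere_volume(r))


rho = 0.01  # hypothetical number density
for r in (1.0, 2.0, 4.0):
    print(r, poisson_count_probability(0, rho, sphere_volume(r)),
          poisson_contact_distribution(rho, r))
```

Deviations of a measured void probability from `poisson_count_probability(0, rho, volume)` indicate correlations in the point distribution, since $P_0$ depends on all n-point correlation functions.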

To facilitate the interpretation of the counts-in-cells one considers their n-th
moments:

$$\sum_{N=0}^{\infty} N^n P_N(C). \qquad (15)$$

They can be expressed by the n-th moment measures $\mu_n$ (for their definition see
e.g. Stoyan et al. 1995):

$$\mu_n(C, \ldots, C) = \sum_{N=0}^{\infty} N^n P_N(C). \qquad (16)$$

Especially the centered moments can be related easily to the n-point correlation
functions. As an example consider the third centered moment with $\bar{N} = \rho |C|$ (e.g.
Coles and Lucchin 1994):

$$\sum_{N=0}^{\infty} (N - \bar{N})^3 P_N(C) = \bar{N} + 3 \bar{N}^2 \sigma^2(C) + \rho^3 \int_C \mathrm{d}\mathbf{x} \int_C \mathrm{d}\mathbf{y} \int_C \mathrm{d}\mathbf{z}\; \xi_3(\mathbf{x}, \mathbf{y}, \mathbf{z}), \qquad (17)$$

where $|C|$ is the volume of $C$, and $\sigma^2(C)$ is given by (10). This centered moment
incorporates information from the two-point and three-point correlations integrated
over the domain $C$. One may go one step further. The factorial moments

$$\sum_{N=0}^{\infty} N (N - 1) \cdots (N - n + 1)\, P_N(C) \qquad (18)$$

attracted more attention recently, since they may be estimated more easily and with a small
variance (Szapudi and Szalay 1998, Szapudi 1998), and offer a concise way to correct
for typical observational problems (Colombi et al. 1998, Szapudi and Colombi 1996).
The factorial moments may be expressed by the n-th factorial moment measures
$\alpha_n$ (Stoyan et al., 1995) or the n-th order product densities:

$$\sum_{N=0}^{\infty} N (N - 1) \cdots (N - n + 1)\, P_N(C) = \alpha_n(C, \ldots, C) = \int_C \mathrm{d}\mathbf{x}_1 \ldots \int_C \mathrm{d}\mathbf{x}_n\; \rho_n(\mathbf{x}_1, \ldots, \mathbf{x}_n), \qquad (19)$$

yielding a simple relation with the integrated n-point correlation functions by (13)
and its generalizations for higher $n$.
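
The moments of the counts-in-cells can be computed directly from a table of probabilities $P_N(C)$. The sketch below does this for factorial and centered moments and uses Poisson counts as a check: with all $\xi_n = 0$ the n-th factorial moment reduces to $\bar{N}^n$ and the third centered moment to $\bar{N}$. The function names and the truncation of the sums at a finite `n_max` are choices made for this illustration.

```python
import math


def poisson_probabilities(mean, n_max):
    """P_N for N = 0..n_max of a Poisson distribution, built recursively."""
    probabilities = [math.exp(-mean)]
    for count in range(1, n_max + 1):
        probabilities.append(probabilities[-1] * mean / count)
    return probabilities


def factorial_moment(probabilities, order):
    """Sum over N of N (N-1) ... (N-order+1) P_N, truncated at len(probabilities)-1."""
    total = 0.0
    for count, p in enumerate(probabilities):
        falling = 1
        for k in range(order):
            falling *= count - k
        total += falling * p
    return total


def centered_moment(probabilities, order):
    """Sum over N of (N - mean)^order P_N, with the mean taken from the same table."""
    mean = sum(count * p for count, p in enumerate(probabilities))
    return sum((count - mean) ** order * p for count, p in enumerate(probabilities))


probabilities = poisson_probabilities(mean=2.0, n_max=80)
print(factorial_moment(probabilities, 3), centered_moment(probabilities, 3))
```

For measured counts the same functions can be applied to the empirical frequencies of $N$; the difference from the Poisson values then carries the integrated correlation functions.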

The moments and the factorial moments are well defined quantities for a stationary
point process. In particular, the relation of the (factorial) moments to the n-point
correlation functions in (17) and (19) is valid for any stationary point process. It is
worth noting that this does not depend on Poisson sampling from a density field.
A lot of work is devoted to relating the properties of the counts in cells to the
dynamics of the underlying matter field (see e.g. Bouchet et al. 1992, Juszkiewicz
et al. 1995, Padmanabhan and Subramanian 1993, Bernardeau and Kofman 1995).
However, this relation depends on the galaxy identification scheme. Typically
the Poisson model (4) is assumed.



3.3 Minkowski functionals 



Minkowski functionals, also called Quermaß integrals, are well known in stochastic
and integral geometry (see e.g. Hadwiger 1957, Weil 1983, Schneider and Weil
1992, Klain and Rota 1997). Quantities like volume, surface area, and sometimes
also integrated mean curvature and Euler characteristic were used to describe
physical processes and to construct models. Such models and significant extensions
of them were put into the context of integral geometry just recently (Mecke and
Wagner 1991, Mecke 1994); see also the article by K. Mecke in this volume. The
first cosmological application of all Minkowski functionals is due to Mecke et al.
(1994), marking the advent of Minkowski functionals as analysis tools for point
processes. In the following years Minkowski functionals became more and more
common in cosmology. The interested reader may consider the articles by Platzöder
and Buchert (1995), Schmalzing et al. (1996), Kerscher et al. (1997), Winitzki and
Kosowsky (1997), Schmalzing and Buchert (1997), Kerscher et al. (1998), Schmalzing
and Górski (1998), Novikov et al. (1999), Beisbart and Buchert (1998), Sahni
et al. (1998), Sathyaprakash et al. (1998a), Hobson et al. (1999), Sathyaprakash
et al. (1998b), Schmalzing et al. (1999a), Schmalzing et al. (1999b), Dolgov et al.
(1999), Schmalzing and Diaferio (1999). In the next section a short introduction to
Minkowski functionals will be given. See also the articles by K. Mecke and W. Weil
in this volume.

3.3.1 A short introduction 

Usually we are dealing with d-dimensional Euclidean space $\mathbb{R}^d$ with the group of
transformations $G$ containing as subgroups rotations and translations. One can then
consider the set of convex bodies embedded in this space and, as an extension, the
so-called convex ring $\mathcal{R}$ of all finite unions of convex bodies. In order to characterize
a body $B$ from the convex ring, also called a poly-convex body, one looks for scalar
functionals $M$ that satisfy the following requirements:

• Motion invariance: The functional should be independent of the body's
position and orientation in space,

$$M(gB) = M(B) \quad \text{for any } g \in G \text{ and } B \in \mathcal{R}. \qquad (20)$$

• Additivity: Uniting two bodies, one has to add their functionals and subtract
the functional of the intersection,

$$M(B_1 \cup B_2) = M(B_1) + M(B_2) - M(B_1 \cap B_2) \quad \text{for any } B_1, B_2 \in \mathcal{R}. \qquad (21)$$

• Conditional (or convex) continuity: The functionals of convex approximations
to a convex body converge to the functionals of the body,

$$M(K_i) \to M(K) \quad \text{as } K_i \to K \text{ for } K, K_i \in \mathcal{K}. \qquad (22)$$

This applies to convex bodies only, not to the whole convex ring. The
convergence for bodies is with respect to the Hausdorff metric.
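
The additivity requirement (21) can be checked explicitly in the simplest setting, $d = 1$, where a poly-convex body is a finite union of closed intervals and the two natural functionals are the total length and the Euler characteristic (the number of connected components). The representation of bodies as lists of intervals and all function names below are assumptions of this sketch.

```python
def normalize(intervals):
    """Merge overlapping or touching closed intervals into disjoint ones."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def union(a, b):
    return normalize(list(a) + list(b))


def intersection(a, b):
    """Pairwise intersection; single points are kept as degenerate intervals."""
    pieces = []
    for lo_a, hi_a in normalize(a):
        for lo_b, hi_b in normalize(b):
            lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
            if lo <= hi:
                pieces.append((lo, hi))
    return normalize(pieces)


def length(body):
    return sum(hi - lo for lo, hi in normalize(body))


def euler_characteristic(body):
    """In one dimension: the number of connected components."""
    return len(normalize(body))


a = [(0.0, 2.0), (5.0, 6.0)]
b = [(1.0, 3.0), (6.0, 8.0)]
for functional in (length, euler_characteristic):
    lhs = functional(union(a, b))
    rhs = functional(a) + functional(b) - functional(intersection(a, b))
    print(functional.__name__, lhs, rhs)
```

Both functionals are also unchanged under translations and reflections of the real line, so they satisfy motion invariance as well; for $d = 1$ these are the $d + 1 = 2$ independent functionals.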

One might think that these fairly general requirements leave a vast choice of such
functionals. Surprisingly, a theorem by Hadwiger states that in fact there are only
$d + 1$ independent such functionals in $\mathbb{R}^d$. To be more precise:

Hadwiger's theorem (Hadwiger, 1957): There exist $d + 1$ functionals $M_\mu$ on the
convex ring $\mathcal{R}$ such that any functional $M$ on $\mathcal{R}$ that is motion invariant, additive
and conditionally continuous can be expressed as a linear combination of them:

$$M = \sum_{\mu=0}^{d} c_\mu M_\mu, \quad \text{with numbers } c_\mu. \qquad (23)$$

In this sense the $d + 1$ Minkowski functionals give a complete and, up to a constant
factor, unique characterization of a poly-convex body $B \in \mathcal{R}$. The four most
common normalizations are $M_\mu$, $V_\mu$, $W_\mu$, and the intrinsic volumes $\bar{V}_\mu$, defined as
follows ($\omega_\mu$ is the volume of the $\mu$-dimensional unit ball):

$$V_\mu = \frac{\omega_d}{\omega_{d-\mu}} M_\mu, \qquad \bar{V}_{d-\mu} = \frac{\omega_d}{\omega_{d-\mu}} \binom{d}{\mu} M_\mu, \qquad W_\mu = \frac{\omega_\mu\, \omega_d}{\omega_{d-\mu}} M_\mu, \qquad \text{with } \omega_\mu = \frac{\pi^{\mu/2}}{\Gamma(1 + \mu/2)}.$$
In three-dimensional Euclidean space, these functionals have a direct geometric
interpretation as listed in Table 1.

Table 1: The most common notations for Minkowski functionals in three-dimensional
space expressed in terms of the corresponding geometric quantities.

geometric quantity         $\mu$   $M_\mu$          $V_\mu$      $W_\mu$        $\bar{V}_{3-\mu}$   $\omega_\mu$
volume $V$                 0       $V$              $V$          $V$            $V$                 1
surface $A$                1       $A/8$            $A/6$        $A/3$          $A/2$               2
int. mean curvature $H$    2       $H/2\pi^2$       $H/3\pi$     $H/3$          $H/\pi$             $\pi$
Euler characteristic $\chi$  3     $3\chi/4\pi$     $\chi$       $4\pi\chi/3$   $\chi$              $4\pi/3$
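
The different normalizations are related by factors built from the unit ball volumes $\omega_\mu$. The following sketch assumes the relations $V_\mu = (\omega_d/\omega_{d-\mu}) M_\mu$, $W_\mu = (\omega_\mu \omega_d/\omega_{d-\mu}) M_\mu$ and $\bar{V}_{d-\mu} = (\omega_d/\omega_{d-\mu}) \binom{d}{\mu} M_\mu$, which reproduce the surface and mean curvature entries of Table 1, and checks them for a unit ball ($V = 4\pi/3$, $A = 4\pi$, $H = 4\pi$, $\chi = 1$). The function names are chosen for this illustration.

```python
import math


def unit_ball_volume(mu):
    """Volume omega_mu of the mu-dimensional unit ball."""
    return math.pi ** (mu / 2.0) / math.gamma(1.0 + mu / 2.0)


def convert_normalizations(m_values, d=3):
    """From M_mu (mu = 0..d) compute V_mu, W_mu and the intrinsic volumes.

    The returned list `intrinsic` is indexed by the order of the intrinsic
    volume, i.e. intrinsic[d - mu] is obtained from M_mu.
    """
    omega = [unit_ball_volume(mu) for mu in range(d + 1)]
    v = [omega[d] / omega[d - mu] * m_values[mu] for mu in range(d + 1)]
    w = [omega[mu] * omega[d] / omega[d - mu] * m_values[mu] for mu in range(d + 1)]
    intrinsic = [0.0] * (d + 1)
    for mu in range(d + 1):
        intrinsic[d - mu] = omega[d] / omega[d - mu] * math.comb(d, mu) * m_values[mu]
    return v, w, intrinsic


# Unit ball in three dimensions, expressed in the M_mu normalization of Table 1.
volume, area, curvature, euler = 4.0 * math.pi / 3.0, 4.0 * math.pi, 4.0 * math.pi, 1.0
m_ball = [volume, area / 8.0, curvature / (2.0 * math.pi ** 2), 3.0 * euler / (4.0 * math.pi)]
v, w, intrinsic = convert_normalizations(m_ball)
print(v)          # V, A/6, H/(3 pi), chi
print(w)          # all equal to the ball volume for a unit ball
print(intrinsic)  # chi, H/pi, A/2, V
```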

Figure 8: Randomly distributed points decorated with balls of varying radius r,
a realization of the Boolean grain model.



3.3.2 The germ-grain model 



Now the Minkowski functionals are used to describe the geometry and topology of
a point set $X = \{\mathbf{x}_i\}_{i=1}^{N}$. Direct application gives rather boring results, $V_\mu(X) = 0$
for $\mu = 0, 1, 2$ and $V_3(X) = N$. However, one may think of $X$ as a skeleton of
more complicated spatial structures in the universe (see e.g. the galaxy distributions
shown in Sect. 2). Decorating
$X$ with balls of radius $r$ puts "flesh" on the skeleton in a well defined way. Also
non-spherical grains may be used.

The Minkowski functionals for the union set of these balls $A_r = \bigcup_{i=1}^{N} B_r(\mathbf{x}_i)$
give non-trivial results, depending on the point distribution considered. We will
use $r$ as a diagnostic parameter specifying a neighborhood relation, to explore the
connectivity and shape of $A_r$.

Let X be a finite subset of a realization of a Poisson process inside some finite 
domain W. Then A r is a part of a realization of the Boolean grain model, illustrated 
in Figure @. For these randomly placed balls the mean volume densities m M of the 



Minkowski functionals are known (e.g. Mecke and Wagner 1991, Schneider and Weil 
1992, also called intensities of Minkowski functionals). 



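$$
\begin{aligned}
m_0(A_r) &= 1 - e^{-\rho M_0}, &
m_2(A_r) &= e^{-\rho M_0} \left( M_2 \rho - M_1^2 \rho^2 \right), \\
m_1(A_r) &= e^{-\rho M_0} M_1 \rho, &
m_3(A_r) &= e^{-\rho M_0} \left( M_3 \rho - 3 M_1 M_2 \rho^2 + M_1^3 \rho^3 \right),
\end{aligned}
\tag{24}
$$

with the number density $\rho$ and

$$
M_0 = \frac{4\pi}{3} r^3, \qquad M_1 = \frac{\pi}{2} r^2, \qquad
M_2 = \frac{2}{\pi} r, \qquad M_3 = \frac{3}{4\pi}.
\tag{25}
$$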
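As a hedged illustration, the densities (24) with the single-ball values (25) can
be evaluated directly. The sketch below assumes balls of one fixed radius and the
$M_\mu$ normalization of Table 1; the names `ball_functionals` and
`boolean_densities` are hypothetical.

```python
from math import exp, pi

def ball_functionals(r):
    """M_0..M_3 of a single ball of radius r, eq. (25)."""
    return (4 * pi * r ** 3 / 3, pi * r ** 2 / 2, 2 * r / pi, 3 / (4 * pi))

def boolean_densities(rho, r):
    """Mean densities m_0..m_3 of the Boolean grain model, eq. (24)."""
    M0, M1, M2, M3 = ball_functionals(r)
    damp = exp(-rho * M0)                  # void probability of a ball B_r
    m0 = 1 - damp
    m1 = damp * M1 * rho
    m2 = damp * (M2 * rho - M1 ** 2 * rho ** 2)
    m3 = damp * (M3 * rho - 3 * M1 * M2 * rho ** 2 + M1 ** 3 * rho ** 3)
    return m0, m1, m2, m3
```

In the dilute limit each ball contributes separately, so $m_3/(\rho M_3)$ tends to
unity, and $m_2$ changes sign at $\rho = M_2/M_1^2$.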



Starting from a general point process and decorating it with spheres, we arrive at
the germ-grain model (see also Stoyan et al. 1995). The Minkowski functionals or
their volume densities calculated for the set $A_r$ may be used as tools to describe
the underlying point distribution, directly comparable to standard point process
statistics like the two-point correlation function (Sect. 3.1) or the nearest neighbor
distribution (Sect. 3.4). Indeed, the volume density $m_0(A_r)$ equals the spherical
contact distribution or, equivalently, one minus the void probability:
$m_0(A_r) = F(r) = H_s(r) = 1 - P_0(B_r)$ (see also Sect. 3.2). Expressions relating the Minkowski
functionals of such a set $A_r$ to the n-point correlation functions of the underlying
point process may be found in Mecke (1994), Mecke et al. (1994), Schmalzing
et al. (1999b) and the contribution by K. Mecke in this volume.

Already for moderate radii $r$ nearly the whole space is filled up by $A_r$,
leading to $m_0(A_r) \approx 1$ and $m_\mu(A_r) \approx 0$ for $\mu > 0$. This illustrates the different
role the radius $r$ plays for the Minkowski functionals compared to the distance $r$
as used in the two-point correlation function $g(r)$. Already for a fixed radius, the
Minkowski functionals of $A_r$ are sensitive to the global geometry and topology of
$A_r$ and, hence, of the decorated point set (see also Sect. 3.3.5). Indeed, point sets
with an identical two-point correlation function but with clearly different large-scale
morphology may be generated easily (see e.g. Baddeley and Silverman 1984,
and Szalay 1997).

All galaxy catalogues are spatially limited. To estimate the volume densities of
Minkowski functionals for such a realization of the germ-grain model given by the
coordinates of galaxies, we use boundary corrections based on the principal kinematical
formula (see Mecke and Wagner 1991, Stoyan et al. 1995, Schmalzing et al. 1996):

$$
m_\mu(A_r) = \frac{M_\mu(A_r \cap W)}{M_0(W)}
 - \sum_{\nu=0}^{\mu-1} \binom{\mu}{\nu}\, m_\nu(A_r)\, \frac{M_{\mu-\nu}(W)}{M_0(W)}.
\tag{26}
$$

We use the convention $\sum_{n=0}^{-1} x_n = 0$. An example illustrating these
boundary corrections is given in Kerscher et al. (1996a).
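The recursive structure of the boundary correction can be sketched as follows:
the density $m_0$ is obtained first, and each higher density subtracts the
contributions of the lower ones weighted with the functionals of the window.
This is a minimal sketch assuming the $M_\mu$ normalization of Table 1 with
binomial weights; the function name and argument layout are hypothetical, and
the measured inputs $M_\mu(A_r \cap W)$ have to come from a separate computation.

```python
from math import comb

def corrected_densities(M_cut, M_win):
    """Boundary-corrected densities m_0..m_3.

    M_cut[mu] holds M_mu(A_r intersected with W), M_win[mu] holds M_mu(W).
    """
    m = []
    for mu in range(len(M_cut)):
        # The sum is empty for mu = 0, matching the stated convention.
        edge = sum(comb(mu, nu) * m[nu] * M_win[mu - nu] for nu in range(mu))
        m.append((M_cut[mu] - edge) / M_win[0])
    return m
```

A simple consistency check is to build $M_\mu(A_r \cap W)$ from known densities
with the forward kinematical formula and to verify that the routine returns them.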

In the following, an application of these methods to a catalogue of galaxy clusters
(Kerscher et al., 1997; an earlier analysis of a smaller cluster catalogue was
given by Mecke et al. 1994) and to a galaxy catalogue will illustrate the qualitative
and quantitative results obtainable with global Minkowski functionals.



3.3.3 Cluster catalogues 

The spatial distribution of centers of galaxy clusters, using the Abell/ACO cluster
sample of Plionis and Valdarnini (1991), was analyzed with Minkowski functionals
applied to the germ-grain model (Kerscher et al., 1997). At first a qualitative
discussion of the observed features is presented, followed by a comparison with
models for the cluster distribution.

The most prominent features of the volume densities of all four Minkowski
functionals are the broader extrema for the Abell/ACO data as compared to the results
for the Poisson process (see Fig. 9). This is a first indication of enhanced clustering.
Let us now look at each functional in detail:

The density of the Minkowski functional $m_0$ measures the density of the covered
volume. On scales between 25 $h^{-1}$ Mpc and 40 $h^{-1}$ Mpc, $m_0$ as a function of $r$ lies
slightly below the Poisson data. The volume density is lower because of the clumping
of clusters on those scales.

The density of the Minkowski functional $m_1$ measures the surface density of the
coverage. It has a maximum at about 20 $h^{-1}$ Mpc both for the Poisson process and
for the cluster data. This maximum is due to the granular structure of the union
set on the relevant scales. At the same scales, we find the maximum deviation from
the characteristics for the Poisson process. The lower values of $m_1$ for the cluster
data with respect to the Poisson process are again an indication of a significant clumping
of clusters at these scales. The functional $m_1$ also shows a positive deviation from
the Poisson process on scales of (35 ... 50) $h^{-1}$ Mpc, where more coherent structures form
in the union set than in the Poisson process, keeping the surface density larger.

The densities of the Minkowski functionals $m_2$ and $m_3$ characterize in more detail
the kind of spatial coverage provided by the union set of balls in the data sample.
The density of the total mean curvature $m_2$ of the data reaches a maximum at
about 10 $h^{-1}$ Mpc produced by the dominance of convex (positive $m_2$) structures.
The density $m_2$ at the maximum is reduced with respect to the Poisson process to
about 70% (or more than three standard deviations). The integral mean curvature
$m_2$ has a zero at a scale of 25 $h^{-1}$ Mpc (almost the scale of the maximum of $m_1$),
corresponding to the turning point between structures with mainly convex and concave
boundaries (negative $m_2$). Significant deviations from the Poisson process occur
between this turning point and 40 $h^{-1}$ Mpc due to the smaller mean curvature of the
union set of the data, probably caused by the interconnection of the void regions in
the cluster distribution.

Figure 9: Densities of the Minkowski functionals for the Abell/ACO (solid line) and
a Poisson process (shaded area) with the same number density. The shaded area
gives the statistical variance of the Poisson process calculated from 100 different
realizations.

The density of the Euler characteristic $m_3$ describes the global topology of the
cluster distribution. On small scales all balls are separated. Therefore, each ball
gives a contribution of unity to the Euler characteristic and $m_3$ is proportional to
the cluster number density. As the radius increases, more and more balls overlap
and $m_3$ decreases. At a scale of about 20 $h^{-1}$ Mpc it drops below zero due to the
emergence of tunnels in the union set (a double torus has $\chi = -1$). The positive
maximum for the Poisson process at scales of about 40 $h^{-1}$ Mpc is the signature of the
presence of cavities. The nearly linear decrease of the Euler characteristic for the
Abell/ACO sample indicates strong clustering on scales < 15 $h^{-1}$ Mpc. The lack of
a significant positive maximum after the minimum shows that only a few cavities
form. This suggests a support dimension for the distribution of clusters of less than
three. The presence of voids on scales of 30 to 45 $h^{-1}$ Mpc is shown by the enhanced
surface area $m_1$ and the reduced integral mean curvature $m_2$, while on these scales
the Euler characteristic is approximately zero.

Figure 10: Densities of the Minkowski functionals for the Abell/ACO (solid line in
both panels) compared to the SCDM (shaded area in top panel). The shaded area
gives the 1σ error bars from the variance among different realizations.



The emphasis of Kerscher et al. (1997) was on the comparison with cosmological
model predictions. For this purpose artificial cluster distributions were constructed
from the density field of N-body simulations. Such simulations are still quite costly
and therefore only four specific models were investigated. In Fig. 10 the comparison
of the observations with the Standard Cold-Dark-Matter (SCDM) model is shown.
This model shows too little clustering on small scales, as is clearly seen from the
enhanced maxima of the surface area $m_1$ and the integral mean curvature $m_2$, as
well as from the flatter decrease of the Euler characteristic $m_3$. Additionally, the higher
volume $m_0$ indicates weak clumping and too few coherent structures also on large
scales. These deviations may be quantified using some norm for the comparison
of the observational data with the model prediction (for details see Kerscher et al.
1997). A comparison of the cluster distribution with CDM models using the power
spectrum (11) led to similar conclusions (Retzlaff et al., 1



3.3.4 Large fluctuations 

A physically interesting point is how well defined the statistical properties of
the galaxy or cluster distribution are when determined from one spatially limited realization
only. In other words, how large are the fluctuations of the morphology for
a domain of given size? Kerscher et al. (1998) investigate this using Minkowski



functionals, the J-function (see section 3.4), and the two-point statistic a 2 ( |To| ) 



By normalizing with the functional $M_\mu(B_r)$ of a single ball we can introduce
normalized, dimensionless Minkowski functionals $\Phi_\mu(A_r)$,

$$
\Phi_\mu(A_r) = \frac{m_\mu(A_r)}{\rho\, M_\mu(B_r)},
\tag{27}
$$

where $\rho$ is the number density. In the case of a Poisson process the exact mean
values are known (24). For decorating spheres with radius $r$ one obtains:

$$
\begin{aligned}
\Phi_0^P &= \frac{1 - e^{-\eta}}{\eta}, &
\Phi_2^P &= e^{-\eta} \left( 1 - \frac{3\pi^2}{32}\, \eta \right), \\
\Phi_1^P &= e^{-\eta}, &
\Phi_3^P &= e^{-\eta} \left( 1 - 3\eta + \frac{3\pi^2}{32}\, \eta^2 \right),
\end{aligned}
\tag{28}
$$

with the dimensionless parameter $\eta = \rho M_0(B_r) = \rho\, 4\pi r^3/3$. For $\mu \geq 1$ the measures
$\Phi_\mu(A_r)$ contain the exponentially decreasing factor $e^{-\eta(r)}$. We employ the reduction

$$
\phi_\mu(A_r) = \frac{\Phi_\mu(A_r)}{\Phi_1^P(A_r)}, \qquad \mu \geq 1,
\tag{29}
$$

and thereby remove the exponential decay and enhance the visibility of differences
in the displays shown below.

Figure 11: Minkowski functionals $\phi_\mu$ of a volume limited sample with 100 $h^{-1}$ Mpc
depth extracted from the IRAS 1.2 Jy catalogue; the dark shaded areas represent
the southern part, the medium shaded the northern part, and the dotted a Poisson
process with the same number density. The shaded areas are the 1σ errors estimated
from twenty realizations for the Poisson process and from twenty errors using a
jackknife procedure with 90% sub-sampling for the data.
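The Poisson reference values and the reduction that removes the exponential
factor are simple enough to be written down as a short sketch. It assumes
$\eta > 0$ and uses the expressions $\Phi_0^P = (1 - e^{-\eta})/\eta$,
$\Phi_1^P = e^{-\eta}$, $\Phi_2^P = e^{-\eta}(1 - 3\pi^2\eta/32)$ and
$\Phi_3^P = e^{-\eta}(1 - 3\eta + 3\pi^2\eta^2/32)$; the function names are
illustrative only.

```python
from math import exp, pi

def poisson_Phi(eta):
    """Normalized Minkowski functionals of a Poisson process, eta > 0."""
    damp = exp(-eta)
    Phi0 = (1 - damp) / eta
    Phi1 = damp
    Phi2 = damp * (1 - 3 * pi ** 2 * eta / 32)
    Phi3 = damp * (1 - 3 * eta + 3 * pi ** 2 * eta ** 2 / 32)
    return Phi0, Phi1, Phi2, Phi3

def reduced(Phi, eta):
    """Reduced functionals phi_1, phi_2, phi_3: divide by the Poisson Phi_1."""
    return [value / exp(-eta) for value in Phi[1:]]
```

Applied to the Poisson values themselves, the reduction leaves the bare
polynomials in $\eta$, which is why differences between data and a Poisson
process remain visible at radii where the exponential factor is already tiny.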

We now apply the methods introduced above to explore a redshift catalogue of
5313 IRAS selected galaxies with limiting flux of 1.2 Jy (Fisher et al., 1995). A
volume limited sample of 100 $h^{-1}$ Mpc depth contains 352 galaxies in the northern
part, and 358 galaxies in the southern part (with respect to galactic coordinates), 
as shown in Fig. ^. As far as the number density, i.e. the first moment of the galaxy 
distribution is concerned, the sample does not reveal significant differences between 
north and south. However, we want to assess the clustering properties of the data 
and, above all, tackle the question whether the southern and northern parts differ or 
not. A characterization of the global morphology using the Minkowski functionals 
(Fig. 11) shows that in both parts of the 1.2 Jy catalogue the clustering of galaxies
on scales up to 10 $h^{-1}$ Mpc is clearly stronger than in the case of a Poisson process, as
inferred from the lower values of the functionals for the surface area, $\phi_1$, the integral
mean curvature, $\phi_2$, and the Euler characteristic, $\phi_3$. Moreover, the northern and
southern parts differ significantly, with the northern part being less clumpy. The
most conspicuous features are the enhanced surface area $\phi_1$ in the southern part
on scales from 12 to 20 $h^{-1}$ Mpc and the kink in the integral mean curvature $\phi_2$
at 14 $h^{-1}$ Mpc. This behavior indicates that dense substructures in the southern
part are filled up at this scale (i.e. the balls in these substructures overlap without
leaving holes).

These strongly fluctuating clustering properties are also visible in the J-function 
(Sect. |3.4j), and the a 2 {B r ) (see (ffOH). An analysis of possible contaminations 



and systematic selection effects showed that these fluctuations are real structural
differences in the galaxy distribution on scales of 100 $h^{-1}$ Mpc, even extending to
200 $h^{-1}$ Mpc (see also Kerscher et al. 1996b). It is interesting to note that an
N-body simulation in a periodic box with side-length of 250 $h^{-1}$ Mpc (Kolatt et al.
1996) was not able to reproduce these large-scale fluctuations.

Figure 12: The black set marks the excursion set $Q_\nu$ of a Gaussian density field
with increasing $\nu$ from left to right. Only the highest peaks remain for large $\nu$.



3.3.5 Minkowski functionals of excursion sets 



In the preceding section the Minkowski functionals were used to characterize the
union set of balls, the body $A_r$. Consider now a smooth density or temperature
field $u(x)$. We wish to calculate the Minkowski functionals of an excursion set $Q_\nu$
over a given threshold $\nu$ (see Fig. 12), defined by

$$
Q_\nu = \{ x \mid u(x) > \nu \}.
\tag{30}
$$

This threshold $\nu$ will be used as a diagnostic parameter. The geometry and topology
of random fields $u(x)$ and their excursion sets was studied extensively by Adler
(1981). Two complementary calculation methods for the Minkowski functionals of
the excursion set $Q_\nu$ were presented by Schmalzing and Buchert (1997).
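To make the notion of an excursion set concrete, the following sketch thresholds
a field sampled on a cubic grid and counts the Euler characteristic of the
resulting union of closed unit cubes as vertices minus edges plus faces minus
cubes. This simple cell counting is an illustrative assumption and is not one of
the two methods of Schmalzing and Buchert (1997); the function names and the
dictionary layout of the field are hypothetical.

```python
from itertools import product

def excursion_voxels(field, nu):
    """Grid cells above threshold; field maps (i, j, k) -> field value."""
    return [pos for pos, value in field.items() if value > nu]

def euler_characteristic(voxels):
    """Euler characteristic of a union of closed unit cubes.

    voxels: iterable of integer (i, j, k) positions of the cubes.
    """
    cells = set()
    for (i, j, k) in voxels:
        # In doubled coordinates an odd entry marks a direction in which the
        # cell is extended, so every vertex, edge, face and cube of the
        # complex gets a unique label and shared cells are counted once.
        for dx, dy, dz in product((0, 1, 2), repeat=3):
            cells.add((2 * i + dx, 2 * j + dy, 2 * k + dz))
    chi = 0
    for cell in cells:
        dim = sum(c % 2 for c in cell)   # 0 vertex, 1 edge, 2 face, 3 cube
        chi += (-1) ** dim
    return chi
```

With closed cubes, two cells touching only along an edge or at a corner count as
connected. A ring of cubes gives $\chi = 0$ (one tunnel) and a hollow shell gives
$\chi = 2$ (one cavity), in line with the interpretation used for the union of balls.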



Starting with a given point distribution, a density field may be constructed by
a convolution with some kernel $k_\epsilon(z)$ of width $\epsilon$:

$$
u(y) = \sum_{i=1}^{N} k_\epsilon(x_i - y).
\tag{31}
$$

Often a triangular or a Gaussian kernel, sometimes with an adaptive smoothing scale
$\epsilon(y)$, is used. A discussion of smoothing techniques may be found in Silverman
(1986).
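A minimal sketch of such a kernel sum, assuming a normalized isotropic Gaussian
kernel with a fixed width $\epsilon$ in three dimensions (the adaptive case is not
covered); the function names are illustrative.

```python
from math import exp, pi

def gaussian_kernel(dx, dy, dz, eps):
    """Isotropic Gaussian kernel of width eps, normalized to unit integral."""
    norm = (2 * pi * eps ** 2) ** -1.5
    return norm * exp(-(dx * dx + dy * dy + dz * dz) / (2 * eps ** 2))

def density(y, points, eps):
    """Smoothed density at position y: sum of kernels centered on the points."""
    return sum(gaussian_kernel(x[0] - y[0], x[1] - y[1], x[2] - y[2], eps)
               for x in points)
```

Evaluating `density` on a grid of positions yields a field whose excursion sets
can then be analyzed for a sequence of thresholds.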

The Euler characteristic $\chi$ of the excursion set is directly related to the genus
$G$ of the iso-density surface separating low from high density regions:



$$
G(\partial Q_\nu) = 1 - \chi(Q_\nu).
$$



(32) 



The analysis of cosmological density fields using the genus of iso-density surfaces
is a well accepted tool in cosmology (see Weinberg et al. 1987, Melott 1990, Coles
et al. 1996 and refs. therein), now incorporated in the more general analysis using
Minkowski functionals. The Euler characteristic of excursion sets in particular also has
applications in other fields like medical image processing (Worsley, 1998).

3.3.6 Gaussianity of the cosmic microwave background



As already mentioned in Sect. 3.2, it is physically very interesting whether the
observed fluctuations in the temperature field of the cosmic microwave background
radiation (CMB), as shown in Fig. 1, are compatible with a Gaussian random field
model. For a Gaussian random field Tomita (1986) obtained analytical expressions
for the Minkowski functionals of $Q_\nu$ in arbitrary dimensions. Since the temperature
fluctuations are given on the celestial sphere, an adapted integral geometry for
spaces with constant curvature must be used (Santaló 1976). Schmalzing and Górski
(1998) took this geometric constraint and further complications due to boundary
and binning effects, as well as noise contributions, into account. They find no
significant deviation from a Gaussian random field for the resolution of the COBE data
set.

Other methods to test for Gaussianity are based on a wavelet analysis (Hobson
et al., 1999), on high-order correlation functions (Heavens, 1999) or on the two-point
correlation function of peaks in the temperature fluctuations (Heavens and
Sheth, 1999).



3.3.7 Geometry of single objects: shape-finders

Looking at high thresholds $\nu$, the excursion set is mainly composed of
separated regions (see Fig. 12). The morphology of these regions may be
characterized using Minkowski functionals and the derived shape-finders (Sahni et al.
1998). Employing the following ratios of the Minkowski functionals, $H_1 = V_0/(2V_1)$,
$H_2 = 2V_1/(\pi V_2)$ and $H_3 = 3V_2/(4V_3)$, one may construct the dimensionless
shape-finders planarity $P$ and filamentarity $F$:

$$
P = \frac{H_2 - H_1}{H_2 + H_1} \qquad \text{and} \qquad F = \frac{H_3 - H_2}{H_3 + H_2}.
\tag{33}
$$



A simple example (Schmalzing et al., 1999a) is provided by a cylinder of radius $r$
and height $\lambda r$ with the Minkowski functionals

$$
V_0 = \pi r^3 \lambda, \qquad V_1 = \frac{\pi}{3} r^2 (1 + \lambda), \qquad
V_2 = \frac{1}{3} r (\pi + \lambda), \qquad V_3 = 1.
\tag{34}
$$

The shape-finders planarity $P$ and filamentarity $F$ for this specific example are
plotted against each other in Fig. 13. Indeed this is nothing else but an inverted
Blaschke diagram for the form factors (Hadwiger 1955, Schneider 1993). Following
Schmalzing et al. (1999a) the shape-finders may be written in terms of the form
factors. With

$$
x = \frac{\pi V_0 V_2}{4 V_1^2}, \qquad y = \frac{8 V_1 V_3}{3 \pi V_2^2},
\tag{35}
$$

one obtains

$$
P = \frac{1 - x}{1 + x} \qquad \text{and} \qquad F = \frac{1 - y}{1 + y}.
\tag{36}
$$

The isoperimetric inequalities (Schneider, 1993) assure that $0 \leq P, F \leq 1$ for convex
bodies. For a sphere one gets $P = 0 = F$.
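The limiting shapes of the cylinder example can be checked numerically. The
sketch below evaluates the ratios $H_1 = V_0/(2V_1)$, $H_2 = 2V_1/(\pi V_2)$ and
$H_3 = 3V_2/(4V_3)$ for a cylinder of radius $r$ and height $\lambda r$ and for a
ball; the function names are illustrative, and the ball values follow from its
volume, surface area and integral mean curvature in the $V_\mu$ normalization.

```python
from math import pi

def shapefinders(V0, V1, V2, V3):
    """Planarity P and filamentarity F from the functionals V_0..V_3."""
    H1 = V0 / (2 * V1)
    H2 = 2 * V1 / (pi * V2)
    H3 = 3 * V2 / (4 * V3)
    return (H2 - H1) / (H2 + H1), (H3 - H2) / (H3 + H2)

def cylinder(r, lam):
    """V_0..V_3 of a cylinder with radius r and height lam * r."""
    return (pi * r ** 3 * lam, pi * r ** 2 * (1 + lam) / 3,
            r * (pi + lam) / 3, 1.0)

def ball(r):
    """V_0..V_3 of a ball with radius r."""
    return (4 * pi * r ** 3 / 3, 2 * pi * r ** 2 / 3, 4 * r / 3, 1.0)
```

A very flat cylinder, `shapefinders(*cylinder(1.0, 1e-6))`, gives a planarity
close to one, while a very long one, `shapefinders(*cylinder(1.0, 1e4))`, gives a
filamentarity close to one.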

One of the results obtained with the shape-finders applied to single objects in
the excursion sets of N-body simulations (Schmalzing et al., 1999a) is given in
Fig. 13. This histogram shows that the majority of the regions inside the excursion
set has $P \approx 0 \approx F$, and a smaller fraction has $P \approx 0$, $F > 0$, whereas only a few
of the regions have $F \approx 0$, $P > 0$. Interpreting regions with e.g. $P \approx 0$, $F > 0$ as
filamentary or line-like structures is tempting but dangerous, since non-convex
regions are also considered. Also, the histogram was constructed from the excursion sets
of all thresholds under consideration.

Figure 13: On the left side a plot of the shape-finders for the cylinder with varying
$\lambda$ is shown, illustrating the turnover from $\lambda \approx 0$, a plane geometry ($P \approx 1$, $F \ll 1$),
through a roughly spherical ($P \approx 0$, $F \approx 0$) to a mainly line-like geometry ($P \ll 1$,
$F \approx 1$) for $\lambda \gg 1$. On the right side a frequency histogram of the shape-finders
determined from the excursion sets of an N-body simulation is shown. Larger
circles correspond to more objects within the shape-finder bin (from Schmalzing
et al. 1999a).

It does not seem to be possible to construct shape-finders based on the global
scalar Minkowski functionals that allow a unique interpretation for non-convex
sets. Abandoning the density field approach and going back to the germ-grain
model and the Minkowski functionals of a union set of balls $A_r = \bigcup_{i=1}^N B_r(x_i)$, one
may assign a partial Minkowski functional to each ball. These partial Minkowski
functionals may be used to extract information on the spatial structure elements,
i.e. whether the ball is located inside a cluster, a sheet or a filament (see Mecke 1994,
Platzöder and Buchert 1995, Schmalzing and Diaferio 1999). Another promising
global method for extracting shape and symmetry information from non-convex
bodies is provided by the global Quermaß vectors (Beisbart et al., 2000).



3.3.8 Other applications of Minkowski functionals 

In the preceding applications we analyzed the union set of balls $A_r$ or the excursion
set $Q_\nu$ with Minkowski functionals. Another possibility is to consider Minkowski
functionals of the Delaunay or Voronoi cells, as determined from the corresponding
tessellation defined by the given point distribution (Muche 1996, Muche 1997,
Kerscher 1998a).

Going beyond motion invariance, instead demanding motion equivariance, one
can construct vector-valued extensions of the Minkowski functionals, the Quermaß
vectors (Hadwiger and Schneider 1971, Beisbart et al. 1999). Beisbart et al. (2000)
investigate the dynamical evolution of the substructure in galaxy clusters using
Quermaß vectors (see also Beisbart and Buchert 1998).

3.4 The J function



Other methods to characterize the spatial distribution of points, well known in
spatial statistics, are the spherical contact distribution $F(r)$ (also denoted by $H_s(r)$),
i.e. the distribution function of the distances $r$ between an arbitrary point and the
nearest object in the point set $X$, and the nearest neighbor distance distribution
$G(r)$, which is defined as the distribution function of the distance $r$ of an object in $X$ to
the nearest other object in $X$. $F(r)$ is related to the void probability $P_0(B_r)$ by
$F(r) = 1 - P_0(B_r)$. For a Poisson distribution it is simply

$$
G(r) = F(r) = 1 - \exp\left( -\rho\, \frac{4\pi}{3}\, r^3 \right).
\tag{37}
$$

Recently, van Lieshout and Baddeley (1996) suggested to use the ratio

$$
J(r) = \frac{1 - G(r)}{1 - F(r)}
\tag{38}
$$

as a further distributional characteristic. For a Poisson distribution $J(r) = 1$ follows
directly from (37). As shown by van Lieshout and Baddeley (1996), a clustered point
distribution implies $J(r) < 1$, whereas regular structures are indicated by $J(r) > 1$.
However, Bedford and van den Berg (1997) showed that $J = 1$ does not imply a
Poisson process. For several point process models $J(r)$, or at least limiting values for
$J(r)$, are known (van Lieshout and Baddeley, 1996). The J function was considered
by White (1979) as the "first conditional correlation function" and used by Sharp
(1981) to test hierarchical models. The relation between $J(r)$ and the cumulants
$\xi_n(r)$ was used by Kerscher (1998b). An empirical study of the performance of the
J-function for several point process models is given by Thönnes and van Lieshout
(1999). A refined definition of the J-function "without edge correction" may be
especially useful for a test on spatial randomness (Baddeley et al., 1999).
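The definitions of $G$, $F$ and $J$ translate into a very small brute-force
estimator. The sketch below sidesteps edge effects by assuming a periodic cubic
box, which is an assumption made here for simplicity and not the estimator used
in the analyses cited in this text; `test_points` stands for the arbitrary
positions at which the contact distances are sampled, and all names are
hypothetical.

```python
from math import sqrt

def periodic_distance(a, b, box):
    """Distance between a and b in a periodic cube of side length box."""
    total = 0.0
    for ai, bi in zip(a, b):
        d = abs(ai - bi) % box
        d = min(d, box - d)
        total += d * d
    return sqrt(total)

def nearest_distance(p, points, box, skip=None):
    """Distance from p to the nearest point, ignoring the index skip."""
    return min(periodic_distance(p, q, box)
               for idx, q in enumerate(points) if idx != skip)

def empirical_J(points, test_points, r, box):
    """Estimate J(r) = (1 - G(r)) / (1 - F(r)); None once F(r) reaches 1."""
    g = [nearest_distance(p, points, box, skip=i) for i, p in enumerate(points)]
    f = [nearest_distance(t, points, box) for t in test_points]
    G = sum(d <= r for d in g) / len(g)
    F = sum(d <= r for d in f) / len(f)
    if F >= 1.0:
        return None
    return (1 - G) / (1 - F)
```

For a simple cubic lattice the nearest neighbor is never closer than the lattice
spacing, so $G(r) = 0$ below that distance while $F(r)$ already grows, and the
estimate lies above one, as expected for a regular structure.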



3.4.1 Clustering of galaxies 

The J-function may be used to characterize the distribution of galaxies or galaxy
clusters and for the comparison with the results from simulations, similar to the
application of the Minkowski functionals in Sect. 3.3.3. This approach was pursued
by Kerscher et al. (1999a). The Perseus-Pisces redshift survey (Wegner et al. 1993
and refs. therein) was compared with galaxy samples constructed from a mixed
dark matter simulation. The observed $J(r)$ determined from a volume limited sample
with 79 $h^{-1}$ Mpc depth differs significantly from the results of the simulations
(Fig. 14). Especially on small scales the galaxy distribution shows a stronger clustering,
as seen from the more steeply decreasing $J(r)$. We could also show that modeling the
galaxy distribution with a simple Poisson cluster process is not appropriate.



3.4.2 Regularity in the distribution of super-clusters?

Einasto et al. (1997b) report a peak in the 3D power spectrum (the Fourier transform
of $\xi_2$) of a catalogue of clusters on a scale of 120 $h^{-1}$ Mpc. Broadhurst et al.
(1990) observed periodicity on approximately the same scale in an analysis of 1D
data from a pencil-beam redshift survey. As is well known from the theory of fluids,
a regular distribution (e.g. of molecules in a hard-core fluid) reveals itself in an
oscillating two-point correlation function and a peak in the structure function,
respectively (see e.g. Hansen and McDonald 1986, and the contribution of H. Löwen
in this volume). In accordance with this, an oscillating two-point correlation function
$\xi_2(r)$, or at least a first peak, was reported on approximately the same scale (e.g.
Mo et al. 1992 and Einasto et al. 1997a). The existence of regularity on large scales
implies a preferred scale in the initial conditions, which would be of major physical
interest.

Figure 14: $J(r)$ for the volume limited sample from the Perseus-Pisces redshift survey
(solid line) and the 1σ range determined from galaxy samples generated by a mixed
dark matter simulation.



Using the $J(r)$-function, Kerscher (1998b) investigates the super-cluster
catalogue (Einasto et al., 1997c) constructed from an earlier version of the cluster
catalogue by Andernach and Tago (1998) using a friend-of-friends procedure. (The
friend-of-friends procedure is called single linkage clustering in the mathematical
literature.) Comparing with Poisson distributed points, one clearly recognizes that
the super-cluster catalogue is a regular point distribution (Fig. 15). However, a
similar signal for $J(r)$ may be obtained by starting with a Poisson process followed by a
friend-of-friends procedure with the same linking length as used in the construction
of the super-cluster catalogue. Only some indication for a regular distribution on
large scales remains, showing that this super-cluster catalogue is seriously affected
by the construction method.
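For readers unfamiliar with the friend-of-friends procedure, the following
sketch shows the idea as single linkage clustering with a union-find structure:
any two points closer than the linking length are placed in the same group, and
groups are merged transitively. It is a generic brute-force illustration, not the
code used to build the catalogue discussed above, and the function names are
hypothetical.

```python
def friends_of_friends(points, linking_length):
    """Group point indices by single linkage with the given linking length."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    limit = linking_length ** 2
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if d2 <= limit:
                parent[find(i)] = find(j)      # merge the two groups

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def group_centres(points, groups):
    """Centre of mass of every group, e.g. as a super-cluster position."""
    return [tuple(sum(points[i][k] for i in g) / len(g) for k in range(3))
            for g in groups]
```

Since every group replaces several nearby points by a single centre, the centres
cannot lie arbitrarily close to each other, which makes it plausible that the
procedure itself introduces a regular signal even for Poisson input.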



3.4.3 $G_n$ and $F_n$

As a direct generalization of the nearest neighbor distance distribution one may
consider the n-th neighbor distance distributions $G_n(r)$, the distribution of the
distance $r$ to the n-th nearest point (e.g. Stoyan and Stoyan 1994). For a Poisson
process in three dimensions we have

$$
G_n(r) = 1 - \frac{\Gamma\left( n, \rho\, \frac{4\pi}{3}\, r^3 \right)}{\Gamma(n)},
\tag{39}
$$

shown in Fig. 16. $\Gamma(n, x) = \int_x^\infty ds\, s^{n-1} e^{-s}$ is the incomplete Gamma function,
$\Gamma(n) = \Gamma(n, 0)$ the complete one. Clearly $G_1(r) = G(r)$. In Fig. 16 the curves for the
first five $G_n(r)$ for a Poisson process are shown, together with their densities $p_n(r)$
defined by

$$
G_n(r) = \int_0^r ds\, p_n(s).
\tag{40}
$$



Figure 15: $J(r)$ determined from the super-cluster sample (solid line) is shown
together with the 1σ range determined from a pure Poisson process (dotted area)
and a Poisson process followed by a similar friend-of-friends procedure (dashed
area) as used to construct the super-cluster catalogue.



The sum of these densities is directly related to the two-point correlation function
(Mazur, 1992):

$$
g(r)\, \rho\, 4\pi r^2 = \sum_{n=1}^{\infty} p_n(r).
\tag{41}
$$
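For integer $n$ the ratio of Gamma functions in the Poisson case reduces to a
finite sum, $\Gamma(n, x)/\Gamma(n) = e^{-x} \sum_{i=0}^{n-1} x^i/i!$, which makes the
n-th neighbor distributions easy to evaluate. The sketch below uses this form
together with the derivative $p_n(r)$ and can be used to check the sum rule for a
Poisson process, where $g(r) = 1$; the function names are illustrative.

```python
from math import exp, factorial, pi

def G_n_poisson(n, r, rho):
    """n-th neighbor distance distribution of a Poisson process in 3D."""
    x = rho * 4 * pi * r ** 3 / 3          # mean number of points in B_r
    return 1 - exp(-x) * sum(x ** i / factorial(i) for i in range(n))

def p_n_poisson(n, r, rho):
    """Density of the distance to the n-th neighbor, p_n = dG_n/dr."""
    x = rho * 4 * pi * r ** 3 / 3
    return rho * 4 * pi * r ** 2 * exp(-x) * x ** (n - 1) / factorial(n - 1)
```

Summing `p_n_poisson` over $n$ reproduces $\rho\, 4\pi r^2$, since the terms
$x^{n-1}/(n-1)!$ add up to $e^{x}$ and cancel the exponential factor.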



The n-th spherical contact distribution $F_n(r)$ is the distribution function of the
distances $r$ between an arbitrary point and the n-th closest object in the point set
$X$ (we assume that the n-th closest point is unique). Clearly $F_1(r) = F(r)$. For
stationary and isotropic point processes $F_n(r)$ is the probability to find at least $n$
points inside a sphere $B_r$ with radius $r$, and therefore

$$
F_n(r) = 1 - \sum_{i=0}^{n-1} P_i(B_r),
\tag{42}
$$

where $P_i(B_r)$ are the counts-in-cells as discussed in Sect. 3.2.

For a Poisson process with number density $\rho$ we obtain directly from (42)

$$
F_n(r) = 1 - \exp(-\rho |B_r|) \sum_{i=0}^{n-1} \frac{(\rho |B_r|)^i}{i!},
\tag{43}
$$

which is essentially the series expansion of the incomplete Gamma function (see e.g.
Abramowitz and Stegun 1984). Therefore,

$$
F_n(r) = 1 - \frac{\Gamma(n, \rho |B_r|)}{\Gamma(n)},
\tag{44}
$$

and we explicitly see that for a Poisson process

$$
F_n(r) = G_n(r).
\tag{45}
$$

This is a special case of Slivnyak's theorem (Stoyan et al., 1995).

Figure 16: In the left plot the $G_n(r)$ with $n = 1, \ldots, 5$ are shown for a Poisson process
with $\rho = 100$. In the right plot the corresponding densities $p_n(r)$ are shown.

A very interesting feature of the $G_n(r)$ and $F_n(r)$ is that their sensitivity to structures
on large scales increases with $n$. As an illustration consider the interval $\Delta_n \subset \mathbb{R}^+$
specified by $\int_{\Delta_n} ds\, p_n(s) = 0.9$. Then $\Delta_n$ is the interval in which 90% of the
distances to the n-th neighbor lie. (The choice of 0.9 is arbitrary and may certainly
be adapted to the problem considered. Also, the interval $\Delta_n$ is "centered" as shown
in Fig. 16.) The empirical $G_n(r)$ may be used to probe structures within this
specific radial range as illustrated in Fig. 16. Going to larger $n$, one considers
distance intervals for larger radii.
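How the probed radial range moves outward with $n$ can be made explicit for a
Poisson process. The sketch below interprets "centered" as an equal-tailed
interval, i.e. it cuts 5% of the probability on each side, which is an assumption
about the construction and not taken from the text. It uses the finite-sum form
of the Poisson $G_n(r)$ and a bisection search; all names are illustrative.

```python
from math import exp, factorial, pi

def G_n_poisson(n, r, rho):
    """n-th neighbor distance distribution of a Poisson process in 3D."""
    x = rho * 4 * pi * r ** 3 / 3
    return 1 - exp(-x) * sum(x ** i / factorial(i) for i in range(n))

def quantile(n, prob, rho):
    """Radius r at which G_n(r) reaches prob, found by bisection."""
    lo, hi = 0.0, 1.0
    while G_n_poisson(n, hi, rho) < prob:   # enlarge the bracket if needed
        hi *= 2
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if G_n_poisson(n, mid, rho) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def central_interval(n, rho, mass=0.9):
    """Equal-tailed interval containing the fraction mass of the distances."""
    tail = (1 - mass) / 2
    return quantile(n, tail, rho), quantile(n, 1 - tail, rho)
```

Comparing `central_interval(1, rho)` with `central_interval(5, rho)` shows both
interval ends shifting to larger radii.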



3.4.4 The $J_n$ function

A drawback of the $J(r)$-function in empirical investigations is that it becomes ill
defined for large radii, since the empirical $F(r)$ reaches unity and the quotient in
(38) diverges. In the following we will discuss the straightforward generalization of
the J-function (38), introducing the $J_n(r)$ functions:

$$
J_n(r) = \frac{1 - G_n(r)}{1 - F_n(r)}.
\tag{46}
$$

From (45) we obtain directly for a Poisson process

$$
J_n(r) = 1 \quad \text{for all } n.
\tag{47}
$$

Qualitatively we expect the same behavior of the $J_n(r)$-functions as for the
$J(r)$-function, but now for a radius $r$ in the interval $\Delta_n$ (defined at the end of Sect. 3.4.3).


• If a point distribution shows clustering on scales $r$ in $\Delta_n$, then $G_n(r)$ increases
faster than for a Poisson process since the n-th nearest neighbor is typically
closer. $F_n(r)$ increases more slowly than for a random distribution. Both
effects result in $J_n(r) < 1$.

• On the other hand, for a point distribution regular on the scale $r$ in $\Delta_n$, $G_n(r)$
increases more slowly than for a Poisson process, since the n-th neighbor is
found at a finite characteristic distance. $F_n(r)$ increases more strongly since the
distance from a random point to the n-th closest point on the regular structure
is typically smaller. This results in $J_n(r) > 1$.

• $J_n(r) = 1$ indicates the transition from regular to clustered structures on
scales $r$ in $\Delta_n$.

Figure 17: The $J_n(r)$ with $n = 1, \ldots, 10$ (bending up successively) for a Matérn
cluster process with $\mu = 10$ and $R = 1.5\, h^{-1}$ Mpc calculated using the reduced
sample estimators.

With a simple point process model we illustrate these properties. In a Matérn
cluster process a single cluster consists of $\mu$ points in the mean, randomly
distributed inside a sphere of radius $R$, where the number of points follows a Poisson
distribution. The cluster centers (not belonging to the point process) form a
Poisson process with a density of $\rho/\mu$ (Stoyan et al., 1995). In Fig. 17 the strong
clustering in the Matérn cluster process is visible from a decline of the $J_n(r)$. This
decline becomes weaker with increasing $n$. For large radii $r$ the $J_n$ acquire a constant
value. Investigating larger scales, i.e. for large $n$, the constant value of $J_n$
shows a trend towards unity, i.e. we start to "see" the Poisson distribution of the
cluster centers.
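A realization of such a cluster process is easy to generate. The sketch below
places Poisson distributed centres with density $\rho/\mu$ in a cubic box and
attaches to each centre a Poisson number of points with mean $\mu$, uniformly
distributed in a ball of radius $R$. The box geometry, the seed handling and all
function names are assumptions made for illustration; points of clusters near the
faces may fall outside the box and are kept as they are.

```python
import random

def poisson_variate(mean, rng):
    """Poisson random number: unit-rate arrivals counted in [0, mean)."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def point_in_ball(centre, R, rng):
    """Uniform random point in the ball of radius R around centre (rejection)."""
    while True:
        offset = [rng.uniform(-R, R) for _ in range(3)]
        if sum(o * o for o in offset) <= R * R:
            return tuple(c + o for c, o in zip(centre, offset))

def matern_cluster(rho, mu, R, box, seed=0):
    """Return (centres, points, labels); labels[i] is the parent of points[i]."""
    rng = random.Random(seed)
    n_centres = poisson_variate(rho / mu * box ** 3, rng)
    centres = [tuple(rng.uniform(0.0, box) for _ in range(3))
               for _ in range(n_centres)]
    points, labels = [], []
    for idx, centre in enumerate(centres):
        for _ in range(poisson_variate(mu, rng)):
            points.append(point_in_ball(centre, R, rng))
            labels.append(idx)
    return centres, points, labels
```

The centres are returned only for inspection; in line with the definition of the
process they are not part of the point set itself.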



3.4.5 On our way to large scales 

A similar behavior may be identified in the galaxy distribution. We calculate the
$J_n$-functions for a volume limited sample of galaxies extracted from the IRAS 1.2 Jy
catalogue with 200 $h^{-1}$ Mpc depth using the reduced sample estimator for both $F_n$
and $G_n$. For small $n$, i.e. small scales, the $J_n(r)$ are all smaller than unity, indicating
clustering out to scales of 40 $h^{-1}$ Mpc (see Fig. 18). For large $n$ the $J_n$ are consistent
with no clustering, i.e. $J_n = 1$. However, a trend towards a $J_n$ larger than unity,
indicating regularity, is observed. Clearly, the results obtained from this sparse
sample with only 280 galaxies may serve mainly as an illustration of the method;
to obtain decisive results we will have to wait for deeper surveys.



4 Summary and Outlook 

In Sections 3.3.3 and 3.4.1 we discussed that advanced geometrical methods like
the Minkowski functionals and the J-function are able to constrain parameters
of cosmological models. However, these geometric methods are not limited
to parameter estimation in cosmological simulations; they are also valuable
tools as point process statistics in general. The direct probe of galaxy surveys
with geometrical methods showed that the large-scale structure exhibits strong
morphological fluctuations (Sect. 3.3.4). Such fluctuations are often attributed to
"cosmic variance" in a Universe homogeneous on very large scales. However, the
fluctuations are astonishingly large even on scales of 200 $h^{-1}$ Mpc. A preferred scale
may be viewed as an indication of a homogeneous galaxy distribution on large
scales. Geometric methods like the J- and $J_n$-functions in particular may be helpful
to identify a preferred scale in the galaxy distribution.

Figure 18: In the left plot the $J_n$ of the IRAS galaxies are shown with $n = 1, 4, 7$
(solid, dotted, dashed), in the right plot the $J_n$ with $n = 10, 15, 20$ (solid, dotted,
dashed).

Perspectives for future research might be as follows:

• Starting with the Minkowski functionals or other well founded geometrical tools,
more specialized methods may be constructed to understand certain features in the
galaxy distribution in detail. An example is the vector-valued extension of the
Minkowski functionals, the Quermaß vectors, used in the investigation of the
substructure in galaxy clusters.

• In empirical work, one has to determine these geometrical measures from a given
point set. The construction of estimators with well understood distributional
properties is crucial to be able to draw decisive conclusions from the data.

• Using these geometrical methods as tools for constraining the cosmological
parameters will be one way to go. Currently this is mainly performed by comparisons
with N-body simulations. Clearly a more direct link between the geometry and the
dynamics of matter in the Universe, promoting our understanding of how structures
form, is desirable. Carefully constructed approximations may be the key ingredient.

• Another way of trying to understand structure formation is to directly investigate
the appearance of geometric features like walls, filaments, and clusters, or to
identify a preferred scale showing up in a regular distribution on large scales. Such
findings will guide us in the construction of approximations which are able to
reproduce such geometric features.